Abstract

Genetically modified enzybiotics (GMEs) are enzybiotics created through molecular engineering approaches to address the
growing problem of antibiotic resistance. We present a fully manually curated database, GMEnzy, which focuses on GMEs
and their design strategies, production and purification methods, and biological activity data. GMEnzy collects and
integrates all available GMEs and their related information into one web-based database. Currently, GMEnzy holds 186 GMEs
from the published literature. The GMEnzy interface is easy to use and allows users to rapidly retrieve data according to
desired search criteria. GMEnzy will make it more efficient and convenient to improve these bioactive proteins for
specific requirements, and will expand the arsenal available to researchers for controlling drug-resistant pathogens. This
database will prove valuable for researchers interested in genetically modified enzybiotics. GMEnzy is freely available
on the Web at http://biotechlab.fudan.edu.cn/database/gmenzy/.
Introduction

The need to develop new classes of antibiotics with novel
mechanisms of action against drug-resistant pathogens is urgent,
given the increasing prevalence of antibiotic resistance [1].
In response, the Infectious Diseases Society of America (IDSA) has
launched a collaboration titled the "10 × '20" initiative to
develop ten new antibacterial drugs by 2020 [2]. Enzybiotics
kill susceptible organisms when applied exogenously as
recombinant proteins and have a low probability of inducing
bacterial resistance, and have therefore attracted much
attention as potential substitutes for conventional
antibiotics [3-6].

Enzybiotics primarily consist of endolysins and virion-associated
peptidoglycan hydrolases from bacteriophages, bacteriocins
and autolysins from bacteria, and lysozymes [7-9]. A
number of recent studies have demonstrated the strong potential
of enzybiotics for controlling and treating infections caused by
various pathogens, especially gram-positive bacteria [7,10-19].
However, when applied externally, enzybiotics often require
modification to improve their properties for specific applications,
particularly in complex environments such as matrices, blood,
or mucous membranes [13]. Several molecular engineering
techniques, including domain shuffling, sequence truncation,
and random or rational mutagenesis, have been used to
optimize enzybiotics by altering their binding or lytic
capabilities [13,20]. Among these methods, domain exchange is unique
to enzybiotics because of their modular architecture [10,21]. In
the early 1990s, Diaz et al. were the first to apply it,
demonstrating that shuffling functional domains between the
pneumococcal autolysin LytA and the phage lysin Cpl-1
switches the catalytic activity and binding properties of the
resulting chimeric enzymes [22]. We refer to enzybiotics
engineered in these ways as genetically modified enzybiotics (GMEs).

Enzybiotics are often grouped with antimicrobial peptides
on the basis of their protein nature and antibacterial activity.
However, they differ from antimicrobial peptides in
both molecular size and antimicrobial mode of action. Although
many databases devoted to antimicrobial peptides have been
developed [23-25], only two existing
databases focus on enzybiotics: EnzyBase [26] and
phiBIOTICS [27]. The former contains 1144 enzybiotics from
216 natural sources, whereas the latter covers
21 enzybiotics and 69 corresponding research
studies. Moreover, no existing database provides information on
how GMEs can be designed, produced, or obtained. To fill this
gap, we developed the manually curated database
GMEnzy.

GMEnzy focuses on GMEs and stores not only basic GME
information but also design strategies, production and purification
methods, and activity data. GMEnzy will serve as a unique tool for
GME studies. It will enhance
understanding of the mechanisms of GME action and increase the
efficiency and convenience of developing these bioactive proteins
on demand. Eventually, GMEnzy could help reduce and/or delay
increases in antibiotic resistance worldwide.

Results and Discussion 

Database Description 

GMEnzy was created as a resource for studies of genetically
modified enzybiotics. As a web-based database, GMEnzy
provides a user-friendly interface for querying and
retrieving information on GMEs. All the data in GMEnzy can be
accessed and retrieved directly from a web browser. All data were
manually collected from the scientific literature and public databases,
or were calculated by computer programs. All GMEnzy data are
categorized and stored in relational tables using MySQL.

GMEnzy comprises eleven MySQL relational tables. Figure 1
shows the schema of the database. The table GMEnzy contains
the protein definition, function, and physicochemical property
fields. Reference information related to each GME is
stored in Reference and Reflink. Information on creating
and producing GMEs is contained in Strategy and Production,
respectively. The features of each GME and their annotations are
stored in Feature. The parent enzybiotics from which GMEs are derived
and their annotations are held in Source and GMlink,
respectively. Available protein sequences are contained in the table
Sequence. In vitro and in vivo antimicrobial activity data for each
GME are stored in Activity. Users can add comments to any of the
GMEs, and these comments are stored in the Comment table.
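
As a rough illustration of how such a relational layout can be queried, the following minimal sketch builds a reduced version of the schema. It is not the GMEnzy implementation: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, it covers only four of the eleven tables, and every column name is an assumption. The sample row reuses the values shown for ClyS (GM0001) on the result page in Figure 2.

```python
import sqlite3

# Table names follow the text; all column names are hypothetical.
SCHEMA = """
CREATE TABLE GMEnzy   (gm_id TEXT PRIMARY KEY, name TEXT, description TEXT,
                       length INTEGER, mass REAL, pi REAL);
CREATE TABLE Strategy (gm_id TEXT REFERENCES GMEnzy(gm_id), strategy TEXT, effect TEXT);
CREATE TABLE Sequence (gm_id TEXT REFERENCES GMEnzy(gm_id), residues TEXT);
CREATE TABLE Activity (gm_id TEXT REFERENCES GMEnzy(gm_id), target TEXT, assay TEXT);
"""


def build_demo_db():
    """Create an in-memory database holding one example entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO GMEnzy VALUES (?, ?, ?, ?, ?, ?)",
        ("GM0001", "ClyS", "Lyses S. aureus", 280, 31998.0, 9.76),
    )
    conn.execute(
        "INSERT INTO Strategy VALUES (?, ?, ?)",
        ("GM0001", "Domains assembly", "Change or extend lytic spectrum"),
    )
    conn.commit()
    return conn


def strategy_of(conn, gm_id):
    """Join the core table with Strategy to get (name, strategy) for one GME."""
    return conn.execute(
        "SELECT g.name, s.strategy FROM GMEnzy g "
        "JOIN Strategy s ON s.gm_id = g.gm_id WHERE g.gm_id = ?",
        (gm_id,),
    ).fetchone()
```

Keeping strategy, sequence, and activity data in separate tables keyed by the GME identifier lets one GME carry several activity records or references without duplicating its core description.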



Database Interfaces 

A concise navigational interface that contains the database 
Browse, Search, Tools, Statistical Info, and Guide options was 
designed to generate a clearly structured database layout that 
enables fast and easy navigation (Figure 2). 

The Browse interface allows users to navigate not only the entire 
database, but also grouped GMEs classified by specified creation 
strategies. Moreover, the Browse interface contains an additional 
link that allows the download of all GMEs in FASTA format. 
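
A downloaded FASTA file can be read with a few lines of code. The sketch below is a generic FASTA reader written for illustration; the header layout of the GMEnzy download is not described here, so the function simply treats everything after ">" as the record name.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header to its concatenated sequence."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        records[header] = "".join(chunks)
    return records
```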

The Search interface can be used to retrieve specific information
using either a Quick or an Advanced option. A Quick search
accepts only keywords, while the Advanced search accepts
up to six separate fields: GM Id, protein name,
parent enzybiotic name, modification strategy, domains, and target
organism. The user can query the database by a single
condition or by a combination of conditions. In
addition, when a search returns more than two results, the result
page offers a quick alignment of them with the ClustalW [28] software.
Each GME result page has
seven sections: summary, construction, annotation, production,
activity, references, and comments (Figure 2). The summary
gives the enzybiotic name, construction strategy, and
resulting effect, together with
basic physicochemical properties such as sequence length,
calculated protein mass, and isoelectric point (pI), and a simple functional
annotation. The construction section shows how the GME
was derived from its parent proteins and how this relates to functional
alterations; its details describe the parent proteins
and link to the databases that describe
them. The annotation section lists all components within
each GME with brief descriptions, the functional
annotations, and the deduced sequence. A link for running BlastP
against GMEnzy appears below the sequence. Production,
activity, and reference information related to each GME
follows. Finally, a comment form at the bottom of the
page lets users submit comments on the
displayed GME to facilitate the exchange of information among
researchers.
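
Combining several optional search conditions is commonly done by assembling a parameterized query. The sketch below illustrates the idea in Python; it is not the site's PHP code, and the column names and the `gme_view` table are hypothetical.

```python
# Hypothetical mapping from the six Advanced search fields to column names.
FIELDS = {
    "gm_id": "gm_id",
    "protein_name": "protein_name",
    "parent_name": "parent_name",
    "strategy": "strategy",
    "domains": "domains",
    "target_organism": "target_organism",
}


def build_query(criteria):
    """AND together the supplied conditions; values are bound, never interpolated."""
    clauses, params = [], []
    for field, value in criteria.items():
        if field not in FIELDS:
            raise ValueError(f"unknown search field: {field}")
        if value:  # skip fields the user left empty
            clauses.append(f"{FIELDS[field]} LIKE ?")
            params.append(f"%{value}%")
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT * FROM gme_view WHERE {where}", params
```

Because only whitelisted column names enter the SQL text and user input travels as bound parameters, a single condition and a combination of conditions are handled by the same code path.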

The Tools interface implements the ClustalW and BlastP 
functions. The ClustalW function provides a simple ClustalW 
Figure 1. Schema of the GMEnzy database.

doi:10.1371/journal.pone.0103687.g001
[Figure 2 screenshot. Result page for GM0001 (ClyS), built by the Domains Assembly strategy: Summary (sequence length 280 AA, mass 31998 Da, isoelectric point 9.76), Construction, Annotation, Production, and Activity sections.]

Figure 2. Screen shots of a GME result page. The screen shots show the advanced search and result views. Please note that not all fields are shown.
doi:10.1371/journal.pone.0103687.g002
Figure 3. Distribution of calculated isoelectric points for the
GMEs in GMEnzy. Every bar indicates the number of GMEs whose calculated
isoelectric point lies in the range from pI - 1 to pI.
doi:10.1371/journal.pone.0103687.g003



service for aligning GMEs with the ClustalW software and
a link to EBI ClustalW, which offers more options for multiple
sequence alignment. The BlastP function provides only
BLASTP searches against GMEnzy and a link to NCBI BLASTP
for users who want to search the full NCBI datasets.

The Statistical Info interface supplies statistical data on the
physicochemical properties, production, and effect information of
the GMEs (see the 'Statistical Description and Findings' section below
for further details).

The Guide interface provides simple instructions to potential 
users on how to use GMEnzy functions. 

Statistical Description and Findings 

The current version of GMEnzy contains 186 GMEs. Most of
the GMEs in GMEnzy are novel sequences: when we used BLASTP
[29,30] to align each full-length GME against UniProt [31], only 17
sequences had identical hits. The lengths of the
GME sequences range from 35 to 879 amino acids. The
distribution of pIs for all the GMEs is shown in Figure 3. The
majority (98.4%) of the GMEs have calculated pIs between
5 and 11, and most of these are basic rather than acidic.
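
The binning used for the pI distribution, where each bar counts GMEs with a pI between pI - 1 and pI, can be sketched as follows. This is an illustrative Python snippet, not the code behind the Statistical Info page; the function names are hypothetical.

```python
import math
from collections import Counter


def pi_histogram(pi_values):
    """Count values per unit-wide bin (pI - 1, pI], keyed by the bin's upper edge."""
    return Counter(math.ceil(v) for v in pi_values)


def share_in_range(pi_values, low=5, high=11):
    """Percentage of values that fall within [low, high]."""
    inside = sum(1 for v in pi_values if low <= v <= high)
    return 100.0 * inside / len(pi_values)
```

For example, a pI of 9.76 is counted in the bar labelled 10, which covers values above 9 and up to 10.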

At present, molecular engineering approaches for GME construction
mainly include domain exchange, truncation, random
and/or rational mutagenesis, and fusion to peptides, although
the specific modification methods vary [13]. We therefore categorized
all GMEs into four groups by construction strategy. Figure 4
shows these categories and the proportion of GMEs in
each. In GMEnzy, the truncation and mutagenesis methods together
account for most (79%, 147/186) of the cases. The domain
assembly method accounts for 20% (37/186) of the cases, while
fusion to peptides accounts for only 1% (2/186)
(Figure 4). This distribution is in accord with the current
state of GME research. In the past, GMEs were mainly created for
the control of gram-positive bacterial infections. More recently,
fusion to peptides has emerged as a new direction for
killing gram-negative pathogens. Furthermore, Díez-Martínez et
al. have introduced an efficient method for improving the lethal
effect of Cpl-7 by inverting the net charge of its cell wall-binding
module [20]. This technique could be adopted to help enzybiotics
fight gram-negative pathogens. Finally, about 27% (50/186) of the
entries in the database are GMEs that exhibit ideal or improved
properties, such as increased lytic activity or an extended lytic
spectrum, after modification (Table 1). Our data suggest that,
among these approaches, domain assembly gives
91% positive results (Table 1) and is the most effective method for genetic
modification (other than the fusion to peptides technique, which
has only two cases).
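
The positive rates reported in Table 1 can be recomputed from the per-effect counts. The sketch below uses only the counts listed in Table 1, whose row totals (34, 89, 61, and 2) differ slightly from the per-strategy counts quoted in the text. Treating "Enhance the solubility" as a positive effect is an assumption, made here because it reproduces both the 91% rate for domain assembly and the overall 50/186.

```python
TABLE_1 = {
    "Domains assembly": {
        "Change or extend lytic spectrum": 18,
        "Improve the lytic activity": 10,
        "Enhance the solubility, change or extend lytic spectrum": 2,
        "Lose the activities of cell targeting": 1,
        "Enhance the solubility": 1,
        "Decrease the lytic activity": 1,
        "Keep the lytic activity": 1,
    },
    "Truncation": {
        "Decrease the lytic activity": 38,
        "Improve the lytic activity": 14,
        "Lose the activities of cell targeting": 14,
        "Lose the lytic activity": 8,
        "Keep the lytic activity": 7,
        "Keep the activities of cell targeting": 6,
        "Undisclosed": 2,
    },
    "Mutagenesis": {
        "Decrease the lytic activity": 26,
        "Keep the lytic activity": 25,
        "Decrease the binding activity": 4,
        "Change or extend lytic spectrum": 2,
        "Lose the activities of cell targeting": 2,
        "Keep the activities of cell targeting": 1,
        "Improve the lytic activity": 1,
    },
    "Fusion to peptides": {"Change or extend lytic spectrum": 2},
}

POSITIVE = {
    "Change or extend lytic spectrum",
    "Improve the lytic activity",
    "Enhance the solubility, change or extend lytic spectrum",
    "Enhance the solubility",  # assumption: counted as a positive effect
}


def positive_rate(effects):
    """Return (positive count, total count, rounded percentage) for one strategy."""
    total = sum(effects.values())
    positive = sum(n for effect, n in effects.items() if effect in POSITIVE)
    return positive, total, round(100 * positive / total)
```

Under these assumptions, `positive_rate(TABLE_1["Domains assembly"])` gives 31 of 34 (91%), and summing over all four strategies gives 50 positive GMEs out of 186.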

Bacteria are the most commonly used expression hosts
according to the data in our database: they serve as hosts in more
than 99% of all cases, while only one case used S. cerevisiae
(Figure 5). E. coli is the most popular expression
host because of its attractive features (e.g., fast
growth, low cost, and convenient operation). Additionally, to
facilitate rapid purification, most (73%, 136/184) of the GMEs are
expressed as fusion proteins with either histidine (His) or
glutathione-S-transferase (GST) purification tags and purified by
affinity chromatography (Figure 6). Ion-exchange chromatography and size
exclusion chromatography account for 19%
(35/184) and 1% (2/184) of all GMEs, respectively, and for some GMEs
the purification method was not disclosed. The traditional
ammonium sulfate precipitation method has almost been abandoned,
as only one case in our database uses it to purify a GME.

Limitations and Future Prospects 

Currently, GMEnzy holds only 186 GMEs. However, because
of rapidly emerging drug resistance in bacteria, GMEs have



[Figure 4 chart. Recoverable label: Mutagenesis (59), 32%.]

Figure 4. Methods for GME construction.

doi:10.1371/journal.pone.0103687.g004
Table 1. Strategies for GME construction and their resulting effect.

Strategy            Effect                                                    No. of GMEs   Positive rate (%)*
Domains assembly    Change or extend lytic spectrum                           18            91
                    Improve the lytic activity                                10
                    Enhance the solubility, change or extend lytic spectrum   2
                    Lose the activities of cell targeting                     1
                    Enhance the solubility                                    1
                    Decrease the lytic activity                               1
                    Keep the lytic activity                                   1
Truncation          Decrease the lytic activity                               38            16
                    Improve the lytic activity                                14
                    Lose the activities of cell targeting                     14
                    Lose the lytic activity                                   8
                    Keep the lytic activity                                   7
                    Keep the activities of cell targeting                     6
                    Undisclosed                                               2
Mutagenesis         Decrease the lytic activity                               26            5
                    Keep the lytic activity                                   25
                    Decrease the binding activity                             4
                    Change or extend lytic spectrum                           2
                    Lose the activities of cell targeting                     2
                    Keep the activities of cell targeting                     1
                    Improve the lytic activity                                1
Fusion to peptides  Change or extend lytic spectrum                           2             100

* Positive rate is the proportion of GMEs with improved lytic activity or an extended lytic spectrum (shown in bold in the original table) among all GMEs counted for that strategy.
doi:10.1371/journal.pone.0103687.t001



attracted increasing interest as potential antimicrobial agents.
The available data are therefore expected to grow rapidly. We will collect
activity information and related data for each new GME and incorporate
them into GMEnzy, updating the database twice a year.

Conclusions 

GMEnzy is a web-accessible database that focuses on
genetically modified enzybiotics. The current version of GMEnzy
has 186 entries (as of January 2014). The database can be queried
either by keywords or by combinations of search
conditions. GMEnzy will not only aid in expanding our current
understanding of GMEs and their mechanisms of action, but may
also inform the development of new GMEs to fight
increasingly drug-resistant pathogens. GMEnzy is
available at http://biotechlab.fudan.edu.cn/database/gmenzy/.

Materials and Methods 

Data Collection and Organization 

We collected all available GME information from the scientific
literature by searching the Web of Science and Google Scholar
with the following query: (keyword: "hydrolyse" OR "endolysin"
OR "lysine" OR "lysozyme" OR "enzybiotics" AND "fusion"
OR "chimeric"), and by exploring the references listed in the
literature found in this first pass. Each citation returned
by the search was checked manually to ensure that it
contained the desired information. All GMEs extracted from the
literature were classified into four groups based on the protein
engineering strategy: domain assembly, truncation, mutagenesis,
and fusion to peptides. All GME sequences were assembled from
UniProt sequences according to the descriptions in the literature.
Additional physicochemical data for each GME were either
calculated with BioPerl programs or obtained from the literature.
All of the collected information was classified and stored in eleven
relational tables in MySQL. GMEnzy currently contains 186
GMEs acquired from more than 100 published papers.
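
To illustrate how a physicochemical property such as the isoelectric point can be computed from a sequence, the sketch below finds the pH at which the net charge is zero by bisection. It is written in Python for illustration and is not the BioPerl code used to populate the database; the pKa values are one commonly used set, assumed here, and other tools use different values that give slightly different pIs.

```python
# Assumed pKa values for ionizable side chains and termini (illustrative only).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # deprotonated C-terminus
    for aa in seq:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """Bisect on pH in [0, 14]; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged: the pI lies at higher pH
        else:
            hi = mid
    return round((lo + hi) / 2.0, 2)
```

A lysine-rich peptide yields a basic pI and an aspartate-rich peptide an acidic one, which is the distinction behind the observation that most GMEs are basic.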

Web Interface and Application 

GMEnzy was built on a 64-bit Windows (2008 R2) server
running WampServer (V2.2d), which integrates the Apache
HTTP Server (V2.2.21) with PHP (V5.3.10) and the MySQL
Server (V5.5.20). All entries are stored in a MySQL database. The
application was coded in PHP, using the jQuery JavaScript
library (V1.6.2), the Highcharts jQuery plugin, and Cascading
Style Sheets (CSS) for the web design. Apache, MySQL, PHP, and
jQuery were chosen because they are open-source and
platform-independent, making them suitable for academic use.
The web server and all parts of the database are hosted at the
Information Office of Fudan University, Shanghai, China.
[Figure 5 chart. Recoverable label: E. coli (184), 98%.]

Figure 5. Hosts for producing GMEs.

doi:10.1371/journal.pone.0103687.g005



[Figure 6 chart. Labels: affinity chromatography (136), 73%; ion exchange chromatography (35), 19%; not disclosed (12), 6%; size exclusion chromatography (2), 1%; ammonium sulphate precipitation (1), 1%.]

Figure 6. Purification methods for GME products.

doi:10.1371/journal.pone.0103687.g006